1. Introduction

CTA will be the first open observatory for very-high-energy γ-rays. It will be the successor
of the current generation of ground-based imaging atmospheric Cherenkov telescope (IACT)
experiments. Arrays may combine up to six types of telescope and up to seven types of camera.
This design exceeds the size and complexity of the current IACT experiments, which
consist of at most 5 telescopes with no more than two telescope and camera types. The
telescopes will record Cherenkov light coming from the extensive air showers (EAS) produced by
primary γ-rays and, mostly, by cosmic rays (CR). The high expected trigger rates of several tens of
kHz, together with the ~10^3 to 10^4 pixels per camera, will lead to huge CTA raw data rates [1].
These data must be processed by the CTA Pipelines [2] during the expected lifetime (~30
years) of the CTA Observatory (CTAO). This contribution presents the Modular Efficient Simple
System (MESS), a CTA Pipeline prototype. It takes into account the data challenges of the CTAO:

• The open observatory condition of CTA requires a more robust framework compared to
current experiments. CTAO has to provide consistent and reproducible results to the
astronomical community.

• Due to the long lifetime of the observatory, libraries and systems with long-term support
will minimise the software maintenance costs.

• The diversity of technology within the CTAO will demand a clear modularity between the 
different software components. Being able to run a variety of pipelines without recompiling 
any program is an advantage and increases the reproducibility of results. 

The next sections will describe the MESS prototype and preliminary results when applied to CTA 
Monte Carlo (MC) data. 

2. Software framework 

The idea behind this proposal is that a software framework should allow developers to
concentrate on their algorithms, without having to learn new paradigms or lose time doing redundant
work. The MESS software framework provides a robust library with clean interfaces, minimal
dependencies and complexity, built with proven and well-known systems. It comes with tools for
automating the redundant development processes and for creating documentation.

2.1 Dependencies 

Dependencies should be minimised, because the fewer there are, the easier it is to build and
maintain the software. The MESS library is written in plain C and has no dependencies except
CFITSIO [3], which is used for storing tabular data in FITS [4] files. Whenever required, additional
libraries and programs can be used, but such dependencies are confined to a single module (shared
object file) and do not affect the rest of the system. The module readctamc, which reads raw
CTA MC data, needs to be linked to hessioxxx (a library that is part of the SIMTELARRAY [6]
package), which also does the pixel calibration. Other MESS programs need to be linked to GSL [7]
and PLPLOT [8] to allow plotting histograms and displaying events.

2.2 Complexity

Library functions should be orthogonal to each other and it should be possible to combine
them in a coherent and straightforward way. A few different data types should be enough to
represent the problems. The current MESS framework version, with basic IACT analysis algorithms
implemented, has 4000 lines of code (without comments), which makes it easy to understand and
to maintain. The library and basic modules compile in less than 3 s.

2.3 Build System 

MESS uses a global Makefile that dynamically includes all Makefiles in the subdirectory, in
order to keep dependencies separated. When a new module or a new program is added, it is enough
to put just its name into the appropriate Makefile. This way, external libraries and readers/writers
for external formats can be integrated in a clean way.

2.4 Autogenerated header files and documentation 

Writing and maintaining header files is no longer necessary: MESS provides the program
c2h, which scans through all source code files (.c/.cxx) and creates header files (.h) from
them, including comments above functions, variables, type definitions etc. Another program in
MESS, h2txt, reads these autogenerated header files and converts them to text files in markdown
format. Markdown is like plain text, but has a few (human readable) tags to allow conversion to nicer
looking HTML with optional bold/italic script, lists, images, links and inline code. The MESS
program txt2html does this conversion, adding support for code blocks and LaTeX formulas on
top of markdown. So the developer only cares about the .c/.cxx files and all redundant work is
done automatically.

2.5 Versioning 

Whenever the MESS library is built, git, which is used as version control system for the
MESS code, is queried for the version of the commit from which the library is being
built. That information, along with time and date, is then written into an automatically created file, which
is linked into the MESS library. All programs linked to the MESS library can query these data,
so developers do not need to keep track of the versions of their software, and users can
state the version of the library in bug reports.
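
The mechanism can be illustrated with a minimal sketch. It assumes that the build system writes three string constants into a generated source file; the names mess_git_commit, mess_build_date, mess_build_time, mess_version_string and mess_built_from are hypothetical and the values are placeholders, not the actual MESS interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* In a real build these constants would live in a file generated at
 * build time (for example from the output of git). The values below
 * are placeholders, and all names are hypothetical. */
static const char mess_git_commit[] = "0000000";
static const char mess_build_date[] = "1970-01-01";
static const char mess_build_time[] = "00:00:00";

/* Format the version information into a caller-supplied buffer.
 * Returns the number of characters needed, as snprintf does. */
int mess_version_string(char *buf, size_t len)
{
    assert(buf != NULL);
    return snprintf(buf, len, "commit %s, built %s %s",
                    mess_git_commit, mess_build_date, mess_build_time);
}

/* Return 1 if the library was built from the given commit, else 0. */
int mess_built_from(const char *commit)
{
    assert(commit != NULL);
    return strcmp(commit, mess_git_commit) == 0;
}
```

A program linked against such a library could print the string in its usage message or attach it to every log file, so that the version is always available for bug reports.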

2.6 Logging 

MESS provides an infrastructure for global logging and per-module logging, so each module 
can have its own log file and log level, and it can also write to the global logger. 
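
As a rough illustration of per-module log levels, the following sketch filters messages by level before writing them. The type Logger, the level names and the function log_msg are assumptions made for this example and are not taken from the MESS library.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical log levels, ordered from most to least important. */
typedef enum { LVL_ERROR = 0, LVL_WARN, LVL_INFO, LVL_DEBUG } LogLevel;

/* One logger per module plus one global logger (hypothetical layout). */
typedef struct {
    FILE       *out;    /* destination, e.g. the module's own log file */
    LogLevel    level;  /* messages less important than this are dropped */
    const char *name;   /* prefix that identifies the module */
} Logger;

/* Write a formatted message if its level passes the logger's filter.
 * Returns 1 if the message was written and 0 if it was filtered out. */
int log_msg(Logger *lg, LogLevel lvl, const char *fmt, ...)
{
    va_list ap;

    assert(lg != NULL && lg->out != NULL && fmt != NULL);
    if (lvl > lg->level)
        return 0;
    fprintf(lg->out, "[%s] ", lg->name);
    va_start(ap, fmt);
    vfprintf(lg->out, fmt, ap);
    va_end(ap);
    fputc('\n', lg->out);
    return 1;
}
```

With this layout a module can log verbosely to its own logger while the global logger only keeps warnings and errors.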

3. Data Structures 

MESS uses simple C structures, so developers can write clean code instead of using nested 
getter/setter methods of classes. Porting code, for example to GPUs, and writing wrappers for 
other languages is also much easier. Since all important data structures contain type and size 
information, it is possible to: 

• mix several messages in a single stream,

• read new data with old code and vice versa, because messages with unknown tags can be
skipped,

• go through a file quickly until an event with a certain id is found, without reading and
decoding all the other events into memory.
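
The benefit of carrying type and size information can be sketched as follows. The example assumes a hypothetical header of two 32 bit fields in front of every payload; the names MsgHeader, put_message and find_message are illustrative and do not describe the actual MESS or ROI layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header that precedes every message in a stream. */
typedef struct {
    uint32_t type;  /* tag identifying the kind of message */
    uint32_t size;  /* number of payload bytes after the header */
} MsgHeader;

/* Append one message to dst and return the number of bytes written.
 * The caller must provide enough space. */
size_t put_message(unsigned char *dst, uint32_t type,
                   const void *payload, uint32_t size)
{
    MsgHeader h;

    assert(dst != NULL && (payload != NULL || size == 0));
    h.type = type;
    h.size = size;
    memcpy(dst, &h, sizeof h);
    if (size > 0)
        memcpy(dst + sizeof h, payload, size);
    return sizeof h + size;
}

/* Return a pointer to the payload of the first message with the wanted
 * type, or NULL if there is none. Other messages, including those with
 * unknown tags, are skipped by their size without being decoded. */
const unsigned char *find_message(const unsigned char *buf, size_t len,
                                  uint32_t wanted, uint32_t *size_out)
{
    size_t pos = 0;

    assert(buf != NULL);
    while (len - pos >= sizeof(MsgHeader)) {
        MsgHeader h;
        memcpy(&h, buf + pos, sizeof h);  /* avoids unaligned access */
        pos += sizeof h;
        if (h.size > len - pos)
            return NULL;                  /* truncated stream */
        if (h.type == wanted) {
            if (size_out != NULL)
                *size_out = h.size;
            return buf + pos;
        }
        pos += h.size;                    /* skip this payload */
    }
    return NULL;
}
```

Searching for an event with a certain id would work the same way, with the id stored at a fixed position so that it can be compared before the rest of the payload is decoded.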

Only two different data structures are currently needed for handling all the different data of the 
experiment: 

Shower data is stored in the Event structure, which has an id, a timestamp and an array
of telescope events. A telescope event has the telescope id, the pixel intensities, the times of
maximum and the list of ids of significant pixels. Events are stored in the Regions-Of-Interest file
format (ROI) [5], which allows storing the full camera image, a pixel list or a region of interest in
the image.

Event parameterisations and subsystem data are stored in a Parset structure, which has an
id, a timestamp and n parameters, each of them being a vector. Parameter sets are stored in FITS
tables with 32 bit floating point precision.
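
The two structures described above might look roughly like the following sketch. Only the listed fields come from the text; the exact types, the member names, the helper type ParVec and the function televent_size are assumptions, and the type and size information mentioned earlier is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* One telescope's part of a shower event (member names are assumed). */
typedef struct {
    int     tel_id;     /* telescope id */
    size_t  npix;       /* number of pixels in the arrays below */
    float  *intensity;  /* pixel intensities */
    float  *tmax;       /* time of maximum for each pixel */
    size_t  nsig;       /* number of significant pixels */
    int    *sig_pix;    /* ids of the significant pixels */
} TelEvent;

/* Shower data: id, timestamp and an array of telescope events. */
typedef struct {
    long      id;
    double    timestamp;
    size_t    ntel;
    TelEvent *tel;
} Event;

/* One parameter of a parameter set; each parameter is a vector. */
typedef struct {
    size_t  len;
    float  *val;        /* 32 bit floats, matching the FITS tables */
} ParVec;

/* Event parameterisations and subsystem data. */
typedef struct {
    long    id;
    double  timestamp;
    size_t  npar;
    ParVec *par;
} Parset;

/* Example of working on the plain structure: sum the intensities of
 * the significant pixels of one telescope event. */
float televent_size(const TelEvent *te)
{
    float sum = 0.0f;
    size_t i;

    assert(te != NULL);
    for (i = 0; i < te->nsig; i++) {
        assert(te->sig_pix[i] >= 0 && (size_t)te->sig_pix[i] < te->npix);
        sum += te->intensity[te->sig_pix[i]];
    }
    return sum;
}
```

Because the data are plain structures and arrays, such a function needs no accessor methods and can be wrapped for other languages with little effort.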

4. Disk storage 

MESS requires the event data to be separated from event parameterisations and subsystem
data. This keeps the data structures, interfaces and file formats significantly simpler and allows
users to access all that data with much less effort. Different calibrations, results of updated
reconstruction algorithms etc. are stored as additional extensions (HDUs) in the same FITS file. This
keeps the directories clean. To prevent excessive file access, the results of the most frequent queries
(like nightly, monthly and yearly summaries) can be stored in the respective directories. Event data
should be divided into chunks of length ~1 s, because that allows parallelisation by simply sending
chunks to different computing nodes. All files are stored in a directory structure similar to the one
shown in figure 1. By using this storage scheme instead of a database, the full power of the shell
is at hand and it becomes easy to access the data. The Hillas parameters of one night, for example,
can be accessed with /data/2015/01/31/*/reconstruction/hillas.fits.



Figure 1: Example of a directory structure suitable for MESS; events are stored in chunks.

5. On-the-fly selection

Since parameterisations and subsystems are stored in FITS tables, the powerful column and
row selection mechanisms of CFITSIO can be used if the user wants to read only a subset of the
data. This way, most of the common queries can be done on the command line, without a database
and without writing dedicated programs. For example, if the user program is to read log(E) of all
3-telescope events, it is enough to write:

program -in "mc.fits[1][NVALID(d) == 3][col logE = log(mc_energy); d=mc_impact_dist]"

Accessing vector columns is also possible. This example shows how to get the impact distance of 
the fifth telescope for all events with an energy above 1 TeV: 

program -in "mc.fits[1][mc_energy > 1][col mc_energy; d=mc_impact_dist[5]]"

In both examples, the input file is filtered and transformed by CFITSIO according to the expressions
in the square brackets, and the user program then reads from that filtered table.

6. Modules 

A module is the smallest functional unit in a MESS pipeline; it can take multiple inputs,
process them and return multiple outputs. It is defined in its own source code file and must expose
at least an init, an exec and an exit function. From this file, the shared object file is generated, which
can then be dynamically loaded by the pipeline program. When a module is loaded, parameters can be passed
to its init function and accessed there as int argc, char **argv, just like
in a standalone program. If there are more functions in the source file and if they are public
(non-static), they are made available as library functions. The module hillas, for example,
has the three obligatory module functions hillas_init(...), hillas_exec(...) and
hillas_exit(...), but it also has the function hillas_televent(...), so users
can either define their pipeline with modules on the command line or write programs the traditional
way: an executable calling libraries.
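
A module following this convention could be sketched as below. The module name thresh, its -min option and the signatures of the exec and exit functions are assumptions made for illustration; the text only states that init receives int argc, char **argv.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* State of the hypothetical module "thresh". */
static double thresh_min   = 0.0;  /* set with "-min <value>" */
static long   thresh_count = 0;    /* values passed so far */

/* init: parse the module parameters like a standalone program would. */
int thresh_init(int argc, char **argv)
{
    int i;

    thresh_min = 0.0;
    thresh_count = 0;
    for (i = 0; i + 1 < argc; i++)
        if (strcmp(argv[i], "-min") == 0)
            thresh_min = strtod(argv[i + 1], NULL);
    return 0;
}

/* exec: process one block of input (signature assumed) and return
 * how many values exceed the threshold. */
int thresh_exec(const double *in, size_t n)
{
    size_t i;
    int passed = 0;

    assert(in != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (in[i] > thresh_min)
            passed++;
    thresh_count += passed;
    return passed;
}

/* exit: release resources; nothing to do in this sketch. */
int thresh_exit(void)
{
    return 0;
}

/* An additional public function, usable like a library call, in the
 * same spirit as hillas_televent. */
long thresh_total(void)
{
    return thresh_count;
}
```

Compiled into a shared object, the three obligatory functions would be called by the pipeline program, while thresh_total could also be called directly from a traditional executable.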

7. Pipelines 

A MESS pipeline is a set of modules that are executed in a defined order. Each module can
access the output of one or more other modules, but circular dependencies must be avoided.
Modules without parents usually read from files and then pass their data on to their children. Modules
without children usually write to files or display a plot. Pipelines can be created on the command
line by giving the types and names of the modules, their parent/child relations and their parameters.
The syntax for that is: type.name:p1,p2,... -par1 val1 -par2 val2 ..., where
p1, p2, ... is the list of parent modules to receive data from and par1, val1, ... are the
parameter/value pairs of a module. A depth-first topological sorting algorithm then resolves the
dependency graph and returns the order in which the modules have to be initialised and executed.
Although nothing has to be compiled, it still runs as fast as a hand-written program containing the
module calls in the correct order, because only pointers are passed between modules. Even very
complex pipelines covering the complete analysis chain can be easily defined on the command line
or in scripts, without involving external libraries or threads. Figure 2 shows an example of a MESS
pipeline that reads full-camera images, applies different cleanings, calculates the Hillas parameters
and writes the results to disk, with the corresponding graph shown in figure 3.

MESS

Ramin Marx

mess -graph g1.dot -pipeline \
    readroi.r: -in gamma.roi, \
    dup.d1:r, dup.d2:r, dup.d3:r, \
    cleanmn.c1:d1 -m 3 -n 6, cleanmn.c2:d2 -m 5 -n 10, cleanmn.c3:d3 -m 10 -n 20, \
    hillas.h1:c1, hillas.h2:c2, hillas.h3:c3, \
    writeps.wps1:h1 -out gamma_hillas_03_06.fits, \
    writeps.wps2:h2 -out gamma_hillas_05_10.fits, \
    writeps.wps3:h3 -out gamma_hillas_10_20.fits, \
    writeroi.wroi1:c1 -out gamma_03_06.roi, \
    writeroi.wroi2:c2 -out gamma_05_10.roi, \
    writeroi.wroi3:c3 -out gamma_10_20.roi

Figure 2: Example of a MESS pipeline doing three different image cleanings, calculating the Hillas
parameters and storing them to different files.

Figure 3: Graph corresponding to the pipeline defined above.

8. Synchronisation 

Since each event and parameter set carries the global event time, it is possible to synchronise
among different readers, for example for subsystem data, events or event parameterisations. The
sync module in MESS does this and it can be combined with the on-the-fly selection. In the
following example (see figures 4 and 5), MC events, their Hillas parameters and their MC information
are read from three different files and synchronised such that only those events enter the pipeline
that have Hillas parameters for more than three telescopes (NVALID(hillas_w) > 3), an
energy of more than 1 TeV (mc_energy > 1) and a mean impact distance of less than 100 m
(AVERAGE(mc_impact_dist) < 100):

mess -graph g2.dot \
    -pipeline \
    sync.s: -key id, \
    readroi.r1:s -in gamma.roi, \
    readps.r2:s -in "gamma_hillas_05_10.fits[1][NVALID(hillas_w) > 3]", \
    readps.r3:s -in "gamma_mc.fits[1][mc_energy > 1 && AVERAGE(mc_impact_dist) < 100]", \
    writeroi.w1:r1 -out gamma_selected.roi, \
    writeps.w2:r2 -out gamma_selected_hillas.fits

Figure 4: Example of a MESS pipeline using the synchronisation module.

Figure 5: Graph corresponding to the pipeline defined above.
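
One way to picture the synchronisation is as an intersection of several key streams. The sketch below assumes that every reader delivers its keys (for example event ids) in ascending order; the type Reader and the function sync_next are hypothetical and only illustrate the idea, not the sync module itself.

```c
#include <assert.h>
#include <stddef.h>

/* A reader seen as a stream of ascending keys (hypothetical type). */
typedef struct {
    const long *id;   /* keys in ascending order */
    size_t      n;    /* number of keys */
    size_t      pos;  /* current read position */
} Reader;

/* Advance all readers to the next key that is present in every one of
 * them. Returns 1 and stores the key in *key, or returns 0 as soon as
 * one reader is exhausted. */
int sync_next(Reader *r, size_t nreaders, long *key)
{
    size_t i;

    assert(r != NULL && nreaders > 0 && key != NULL);
    for (;;) {
        long max;
        int aligned = 1;

        for (i = 0; i < nreaders; i++)
            if (r[i].pos >= r[i].n)
                return 0;
        /* The largest current key is the earliest possible match. */
        max = r[0].id[r[0].pos];
        for (i = 1; i < nreaders; i++)
            if (r[i].id[r[i].pos] > max)
                max = r[i].id[r[i].pos];
        /* Skip entries that cannot match any more. */
        for (i = 0; i < nreaders; i++) {
            while (r[i].pos < r[i].n && r[i].id[r[i].pos] < max)
                r[i].pos++;
            if (r[i].pos >= r[i].n)
                return 0;
            if (r[i].id[r[i].pos] != max)
                aligned = 0;
        }
        if (aligned) {
            *key = max;
            for (i = 0; i < nreaders; i++)
                r[i].pos++;
            return 1;
        }
    }
}
```

If one reader has already been reduced by an on-the-fly selection, only the keys that survived the selection can appear in the synchronised output, which matches the behaviour of the example above.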


9. Plotting 

MESS provides a program to plot histograms of table columns of the FITS files given on the
command line. Through CFITSIO column and row selection, the desired parameter and its range
to be plotted can be specified. The following two examples assume that the Hillas parameters
of 3,6-, 5,10- and 10,20-cleaned images have already been calculated. Figure 6 shows the three
distributions of Hillas length on the left. On the right, Hillas width and Hillas length of
5,10-cleaned images are plotted.

mess.plothist -nbins 40 -in \
    "gamma_hillas_03_06.fits[1][col hillas_l]" \
    "gamma_hillas_05_10.fits[1][col hillas_l]" \
    "gamma_hillas_10_20.fits[1][col hillas_l]"

mess.plothist -in \
    "gamma_hillas_05_10.fits[1][col hillas_w]" \
    "gamma_hillas_05_10.fits[1][col hillas_l]"

Figure 6: Hillas length for different image cleanings and Hillas width compared to Hillas length.

10. Event display

MESS provides the infrastructure for writing flexible event displays and supplies the user with
a demo program using that functionality. Displaying the whole array with the triggered telescopes
and camera images is possible, as well as a detailed magnified view of the individual telescope
events, showing either full camera images or only the regions of interest. Each drawing routine
can receive arbitrary parameters to draw event parameterisations on top of events, change the color
palette etc. The resulting images can then be exported to different formats, for example png, eps or
pdf.

11. Conclusion 

MESS is a software framework designed for data processing in γ-ray astronomy, with
emphasis on modularity, efficiency and simplicity. It complies with the Unix philosophy and its programs
can be easily embedded in scripts. Its library allows developers to write modules and programs
quickly, and with few lines of code. The library functions can be used in C and C++ or wrapped
for scripting languages like Python.

End users can define pipelines on the command line, which gives them much more flexibility
than with config files, but without the need for programming or even scripting. Several examples 
have shown how MESS pipelines can handle complex tasks that usually require writing a dedicated 
program. Despite this flexibility, there is no degradation in performance or robustness, because 
MESS modules are shared libraries that are selectively pulled in. 

Currently, MESS can read raw CTA MC data and perform all necessary steps to produce
gamma/hadron separation plots from it, but more modules need to be developed for a complete CTA
pipeline. MESS is free software and can be downloaded from http://www.mpi-hd.mpg.de/ rmarx/mess.